[1]:
import emat
from emat.util.loggers import timing_log
emat.versions()
emat 0.5.2, plotly 4.14.3
Meta-Model Creation¶
To demonstrate the creation of a meta-model, we will use the Road Test example model included with TMIP-EMAT. We will first create and run a design of experiments, to generate some experimental data with which to define the meta-model.
[2]:
import emat.examples
scope, db, model = emat.examples.road_test()
design = model.design_experiments(design_name='lhs')
results = model.run_experiments(design)
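The design_name='lhs' here requests a Latin hypercube sample over the scope's parameter space: each parameter's range is divided into equal-probability bins, and each bin is used exactly once. A minimal numpy sketch of the idea (illustrative only, not emat's actual implementation):

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng=None):
    # One stratified draw per bin: each dimension's [0, 1) range is
    # split into n_samples equal bins, and each bin is sampled once.
    rng = np.random.default_rng(rng)
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_dims))) / n_samples
    # Shuffle each column independently so bins are paired at random
    # across dimensions.
    for j in range(n_dims):
        rng.shuffle(u[:, j])
    return u

pts = latin_hypercube(10, 3, rng=42)
```

Each column of `pts` contains exactly one point in each of the ten bins, which is what gives Latin hypercube designs better one-dimensional coverage than simple random sampling.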
We can then create a meta-model automatically from these experiments.
[3]:
from emat.model import create_metamodel
with timing_log("create metamodel"):
    mm = create_metamodel(scope, results, suppress_converge_warnings=True)
<TIME BEGINS> create metamodel
< TIME ENDS > create metamodel <11.14s>
If you are using the default meta-model regressor, as we are doing here, you can directly access a cross-validation method that uses the experimental data to evaluate the quality of the regression model. The cross_val_scores method provides a measure of how well the meta-model predicts the experimental outcomes, similar to an R² measure on a linear regression model.
[4]:
with timing_log("crossvalidate metamodel"):
    display(mm.cross_val_scores())
<TIME BEGINS> crossvalidate metamodel
| Measure | Cross Validation Score |
|---|---|
| no_build_travel_time | 0.9908 |
| build_travel_time | 0.9917 |
| time_savings | 0.9125 |
| value_of_time_savings | 0.9113 |
| net_benefits | 0.6382 |
| cost_of_capacity_expansion | 0.8978 |
| present_cost_expansion | 0.9461 |
< TIME ENDS > crossvalidate metamodel <14.22s>
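The scores above are coefficient-of-determination style measures computed by cross validation: the experiments are split into folds, the regressor is refit with each fold held out, and the held-out predictions are scored. A minimal sketch of that procedure (illustrative only; the `cross_val_r2` helper and the simple least-squares fit stand in for emat's actual regressor):

```python
import numpy as np

def r2_score(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def cross_val_r2(fit_predict, X, y, k=5):
    # k-fold cross validation: hold out each fold in turn, fit on the
    # remaining data, score predictions on the held-out fold, average.
    n = len(y)
    folds = np.array_split(np.arange(n), k)
    scores = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        y_pred = fit_predict(X[train_idx], y[train_idx], X[test_idx])
        scores.append(r2_score(y[test_idx], y_pred))
    return float(np.mean(scores))

# Stand-in regressor: ordinary least squares with an intercept term.
def linear_fit_predict(X_tr, y_tr, X_te):
    coef, *_ = np.linalg.lstsq(np.c_[X_tr, np.ones(len(X_tr))], y_tr, rcond=None)
    return np.c_[X_te, np.ones(len(X_te))] @ coef

rng = np.random.default_rng(0)
X = rng.random((100, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + 1  # noiseless linear target
score = cross_val_r2(linear_fit_predict, X, y)
```

On this noiseless linear target the cross-validation score is essentially 1.0; the table above shows the same statistic for each performance measure, with lower values (e.g. net_benefits) flagging measures the meta-model fits less well.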
We can apply the meta-model directly on a new design of experiments, and use the contrast_experiments visualization tool to review how well the meta-model is replicating the underlying model’s results.
[5]:
design2 = mm.design_experiments(design_name='lhs_meta', n_samples=10_000)
[6]:
with timing_log("apply metamodel"):
    results2 = mm.run_experiments(design2)
<TIME BEGINS> apply metamodel
< TIME ENDS > apply metamodel <0.11s>
[7]:
results2.info()
<class 'emat.experiment.experimental_design.ExperimentalDesign'>
Int64Index: 10000 entries, 0 to 9999
Data columns (total 20 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 alpha 10000 non-null float64
1 amortization_period 10000 non-null int64
2 beta 10000 non-null float64
3 debt_type 10000 non-null category
4 expand_capacity 10000 non-null float64
5 input_flow 10000 non-null int64
6 interest_rate 10000 non-null float64
7 interest_rate_lock 10000 non-null bool
8 unit_cost_expansion 10000 non-null float64
9 value_of_time 10000 non-null float64
10 yield_curve 10000 non-null float64
11 free_flow_time 10000 non-null int64
12 initial_capacity 10000 non-null int64
13 no_build_travel_time 10000 non-null float64
14 build_travel_time 10000 non-null float64
15 time_savings 10000 non-null float64
16 value_of_time_savings 10000 non-null float64
17 net_benefits 10000 non-null float64
18 cost_of_capacity_expansion 10000 non-null float64
19 present_cost_expansion 10000 non-null float64
dtypes: bool(1), category(1), float64(14), int64(4)
memory usage: 1.5 MB
[8]:
from emat.analysis import contrast_experiments
contrast_experiments(mm.scope, results2, results)
[Contrast figures for each performance measure: No Build Time, Build Time, Time Savings, Value Time Save, Net Benefits, Cost of Expand, Present Cost]
Partial Metamodels¶
It may be desirable in some cases to construct a partial metamodel, covering only a subset of the performance measures. This is particularly likely to be desirable if a large number of performance measures are included in the scope, but only a few are of interest for a given analysis. The time required for generating and using meta-models is linear in the number of performance measures included, so if your scope has 100 performance measures but you are presently interested in only 5, the meta-model can be created much faster by including only those 5. It will also run much faster, although metamodel run times are so small that you are unlikely to notice.
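The linear scaling follows from the structure of the meta-model: one independent regressor is fit per included performance measure. A toy sketch of that structure (the `fit_metamodel` helper and the polyfit stand-in are illustrative assumptions, not emat's API):

```python
import numpy as np

def fit_metamodel(X, measures, fit_one):
    # One independent regressor per performance measure, so total
    # fitting cost is (per-measure cost) x (number of measures) --
    # hence the linear scaling noted above.
    return {name: fit_one(X, y) for name, y in measures.items()}

rng = np.random.default_rng(1)
X = rng.random(50)
measures = {f"measure_{i}": 2 * X + i for i in range(5)}

# Least-squares line fit as a stand-in for the real regressor.
fit_line = lambda X, y: np.polyfit(X, y, 1)

full_mm = fit_metamodel(X, measures, fit_line)               # 5 fits
partial_mm = fit_metamodel(X, {"measure_0": measures["measure_0"]},
                           fit_line)                          # 1 fit
```

Restricting the dictionary of measures restricts the fitting work in direct proportion, which is the effect the include_measures argument below achieves.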
To create a partial meta-model for a curated set of performance measures, you can use the include_measures argument of the create_metamodel command.
[9]:
with timing_log("create limited metamodel"):
    mm2 = create_metamodel(
        scope, results,
        include_measures=['time_savings', 'present_cost_expansion'],
        suppress_converge_warnings=True,
    )

with timing_log("crossvalidate limited metamodel"):
    display(mm2.cross_val_scores())

with timing_log("apply limited metamodel"):
    results2_limited = mm2.run_experiments(design2)

results2_limited.info()
<TIME BEGINS> create limited metamodel
< TIME ENDS > create limited metamodel <3.33s>
<TIME BEGINS> crossvalidate limited metamodel
| Measure | Cross Validation Score |
|---|---|
| time_savings | 0.8559 |
| present_cost_expansion | 0.9297 |
< TIME ENDS > crossvalidate limited metamodel <5.47s>
<TIME BEGINS> apply limited metamodel
< TIME ENDS > apply limited metamodel <0.05s>
<class 'emat.experiment.experimental_design.ExperimentalDesign'>
Int64Index: 10000 entries, 0 to 9999
Data columns (total 15 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 alpha 10000 non-null float64
1 amortization_period 10000 non-null int64
2 beta 10000 non-null float64
3 debt_type 10000 non-null category
4 expand_capacity 10000 non-null float64
5 input_flow 10000 non-null int64
6 interest_rate 10000 non-null float64
7 interest_rate_lock 10000 non-null bool
8 unit_cost_expansion 10000 non-null float64
9 value_of_time 10000 non-null float64
10 yield_curve 10000 non-null float64
11 free_flow_time 10000 non-null int64
12 initial_capacity 10000 non-null int64
13 time_savings 10000 non-null float64
14 present_cost_expansion 10000 non-null float64
dtypes: bool(1), category(1), float64(9), int64(4)
memory usage: 1.1 MB
There is also an exclude_measures argument for the create_metamodel command, which retains all of the scoped performance measures except those in the enumerated list. This can be handy for dropping a few measures that are not working well, either because the data is bad in some way or because the measure is not well fitted by the metamodel.
[10]:
with timing_log("create limited metamodel"):
    mm3 = create_metamodel(
        scope, results,
        exclude_measures=['net_benefits'],
        suppress_converge_warnings=True,
    )

with timing_log("crossvalidate limited metamodel"):
    display(mm3.cross_val_scores())

with timing_log("apply limited metamodel"):
    results3_limited = mm3.run_experiments(design2)

results3_limited.info()
<TIME BEGINS> create limited metamodel
< TIME ENDS > create limited metamodel <10.62s>
<TIME BEGINS> crossvalidate limited metamodel
| Measure | Cross Validation Score |
|---|---|
| no_build_travel_time | 0.9807 |
| build_travel_time | 0.9736 |
| time_savings | 0.8864 |
| value_of_time_savings | 0.8363 |
| cost_of_capacity_expansion | 0.8896 |
| present_cost_expansion | 0.9454 |
< TIME ENDS > crossvalidate limited metamodel <11.09s>
<TIME BEGINS> apply limited metamodel
< TIME ENDS > apply limited metamodel <0.10s>
<class 'emat.experiment.experimental_design.ExperimentalDesign'>
Int64Index: 10000 entries, 0 to 9999
Data columns (total 19 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 alpha 10000 non-null float64
1 amortization_period 10000 non-null int64
2 beta 10000 non-null float64
3 debt_type 10000 non-null category
4 expand_capacity 10000 non-null float64
5 input_flow 10000 non-null int64
6 interest_rate 10000 non-null float64
7 interest_rate_lock 10000 non-null bool
8 unit_cost_expansion 10000 non-null float64
9 value_of_time 10000 non-null float64
10 yield_curve 10000 non-null float64
11 free_flow_time 10000 non-null int64
12 initial_capacity 10000 non-null int64
13 no_build_travel_time 10000 non-null float64
14 build_travel_time 10000 non-null float64
15 time_savings 10000 non-null float64
16 value_of_time_savings 10000 non-null float64
17 cost_of_capacity_expansion 10000 non-null float64
18 present_cost_expansion 10000 non-null float64
dtypes: bool(1), category(1), float64(13), int64(4)
memory usage: 1.4 MB